Git and version control

Author

Murray Logan

Published

January 19, 2024

This tutorial will take a modular approach. The first section will provide an overview of the basic concepts of git. The second section will provide a quick overview of basic usage and the third and final section will cover intermediate level usage. In an attempt to ease understanding, the tutorial will blend together git commands and output, schematic diagrams and commentary.

The following table serves as both a key and overview of the most common actions and git ‘verbs’.

Initialize git
git init
Establish a git repository (within the current path if no path provided)
Staging
git add <file>
where file is one or more files to stage
Staging is indicating which files and their states are to be included in the next commit.
Committing
git commit -m "<Commit message>"
where <Commit message> is a message to accompany the commit
Committing generates a ‘snapshot’ of the file system.
Checkout
git checkout "<commit>"
where <commit> is a reference to a commit to be reviewed
Explore the state associated with a specific commit
Reset
git reset --hard "<commit>"
where <commit> is a reference to a commit
Return to a previous state, effectively erasing subsequent commits.
Revert
git revert "<commit>"
where <commit> is a reference to a commit that should be nullified (inverted)
Generate a new commit that reverses the changes introduced by a commit thereby effectively rolling back to a previous state (the one prior to the nominated commit) whilst still maintaining full commit history.
Branching
git branch <name>
git checkout <name>
where <name> is a reference to a branch name (e.g. ‘Feature’)
Take edits in the project in a new direction to allow for modifications that will not affect the main (master) branch.
Merging
git checkout main
git merge <name>
where <name> is a reference to a branch name (e.g. ‘Feature’) that is to be merged back into main (master).
Incorporate changes in a branch into another branch (typically main).
Rebasing
git rebase -i HEAD~<number>
where <number> is the number of previous commits to squash together with head.
Combine multiple commits together into a single larger commit.
Pulling
git pull <remote> <branch>
where <remote> is the name of the remote (typically origin) and <branch> is the branch to sync with remote (typically main).
Pull changes from a branch of a remote repository.
Pushing
git push -u <remote> <branch>
where <remote> is the name of the remote (typically origin) and <branch> is the branch to sync with remote (typically main).
Push changes up to a branch of a remote repository.

1 Context

Git is a distributed versioning system. This means that the complete contents and history of a repository (in simplistic terms a repository is a collection of files and associated metadata) can be completely duplicated across multiple locations.

No doubt you have previously been working on a file (a document, spreadsheet, script or any other type of file) and reached a point where your edits were starting to change the file substantially. At that point, you may have considered saving the file under a new name to indicate that it is a new version.

In the above diagram, new content is indicated in red and modifications in blue.

Whilst this approach is ok, it is a fairly limited and unsophisticated approach to versioning (keeping multiple versions of a file). Firstly, if you edit this file over many sessions and each time save with a different name, it becomes very difficult to keep track of either what changes are associated with each version of the file, or the order in which the changes were made. This is massively compounded if a project comprises multiple files or has multiple authors.

Instead, imagine a system in which you could take a snapshot of the state of your files and also provide a description outlining what changes you have made. Now imagine that the system was able to store and keep track of a succession of such versions in a way that allows you to roll back to any previous version of the files and exchange the entire history of changes with other collaborators - that is the purpose of git.

In the above diagram (which I must point out is not actually how git works), you can see that we are keeping track of multiple documents and potentially multiple changes within each document. What constitutes a version (as in how many changes and to what files) is completely arbitrary. Each individual edit can define a separate version.

One of the issues with the above system is that there is a lot of redundancy. With each new version an additional copy of the project’s entire filesystem (all its files) must be stored. In the above case, Version 2 and 3 both contain identical copies of fileA.doc. Is there a way of reducing the required size of the snapshots by only keeping copies of those files that have actually changed? This is what git achieves. Git versions (or snapshots, known as commits) store files that have changed since the previous snapshot, and files that have not changed are only represented by links to instances of these files within previous snapshots.

Now consider the following:

  • You might have noticed that a new version can comprise multiple changes across multiple files. However, what if we have made numerous changes to numerous files over the course of an editing session (perhaps simultaneously addressing multiple different editing suggestions at a time), yet we do not want to lump all of these changes together into a single save point (snapshot)? For example, the multiple changes might constitute addressing three independent issues, so although all edits were made simultaneously, we wish to record and describe the changes in three separate snapshots.

  • What if this project had multiple contributors, some of whom are working on new components of the project and some of whom are working simultaneously on the same set of files? How can the system ensure that all contributors are in sync with each other and that new components are only introduced to the project proper once they are stable and agreed upon?

  • What if there are files present within our project that we do not wish to keep track of? These files could be log files, compilation intermediates etc.

  • Given that projects can comprise many files (some of which can be large), is it possible to store compressed files so as to reduce the storage and bandwidth burden?

2 Overview of git

The above discussion provides context for understanding how git works. Within git, files can exist in one of four states:

  • untracked - these are files within the directory tree that are not to be included in the repository (not part of any snapshot)
  • modified - these are files that have changed since the last snapshot
  • staged - these are files that are nominated to be part of the next snapshot
  • committed - these are files that are represented in a stored snapshot (called a commit). Once a snapshot is committed, it is a permanent part of the repository's history

Since untracked files are not part of a repository, we will ignore these for now.

Conceptually, there are three main sections of a repository:

  • Working directory - (or Workspace) is the obvious tree (set of files and folders) that is present on disc and comprises the actual files that you directly create, edit etc.
  • Staging area - (or index) is a hidden file that contains metadata about the files to be included in the next snapshot (commit)
  • Repository - the snapshots (commits). The commits are themselves just additional metadata pointing to a particular snapshot.

A superficial representation of some aspects of the git version control system follows. Here, the physical file tree in the workspace can be added to the staging area before this snapshot can be committed to the local repository.

After we create the two files (file 1 and file 2) in the workspace, both files are considered to be in an untracked state. Adding the files to the staging area changes their state to staged. Finally, when we commit, the files are in a committed state.
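The progression of states described above can be observed from the command line. The following is a minimal sketch, assuming a throwaway directory created with mktemp and a placeholder identity (neither is part of this tutorial's own repository). It uses git status --porcelain, which prints a two-character code in front of each file name.

```shell
set -eu
repo="$(mktemp -d)"                      # throwaway directory (assumption for this sketch)
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity, stored in this repo only
git config user.email "user@example.com"

echo 'File 1' > file1
untracked="$(git status --porcelain)"    # '?? file1' : git sees the file but does not track it
git add file1
staged="$(git status --porcelain)"       # 'A  file1' : added to the staging area
git commit -q -m 'Add file1'
committed="$(git status --porcelain)"    # empty : workspace matches the last commit

printf '%s\n' "$untracked" "$staged" "[$committed]"
```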

Now if we add another file (file 3) to our workspace, add this file to the staging area and then commit the change, the resulting committed snapshot in the local repository will resemble the workspace. Note, although the staging area contains all three files, only file 3 points to any new internal content - since file 1 and file 2 are unmodified, their instances in the staging area point to the same instances as previously. Similarly, the second commit in the local repository will point to one new representation (associated with file 3) and two previous representations (associated with file 1 and file 2).
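One way to check that unchanged files are shared between snapshots is to compare the object name that each commit records for the same file. This is a sketch in a throwaway repository (the directory and identity are placeholders): if file1 is untouched between two commits, both commits should refer to the identical blob.

```shell
set -eu
repo="$(mktemp -d)"                      # throwaway directory (assumption for this sketch)
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"

echo 'File 1' > file1
echo 'File 2' > file2
git add file1 file2
git commit -q -m 'Add file1 and file2'

echo 'File 3' > file3
git add file3
git commit -q -m 'Add file3'

# Name of the blob that each commit records for file1
first="$(git rev-parse HEAD~1:file1)"
second="$(git rev-parse HEAD:file1)"
echo "first commit : $first"
echo "second commit: $second"
git ls-tree HEAD                         # three entries, only file3 is a new blob
```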

Initially, it might seem that there is an awful lot of duplication going on. For example, if we make a minor alteration to a file, why not just commit the change (delta) instead of an entirely new copy? Well, periodically, git will perform garbage collection on the repository. This process repacks the objects together into a single pack file that comprises only the original blobs and their subsequent deltas - thereby gaining efficiency. The process of garbage collection can also be forced at any time via:

git gc
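The effect of garbage collection can be inspected with git count-objects -v, which reports loose objects under count and packed objects under in-pack. The sketch below assumes a throwaway repository with three small commits (file name and identity are placeholders).

```shell
set -eu
repo="$(mktemp -d)"                      # throwaway directory (assumption for this sketch)
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"

for i in 1 2 3; do
  echo "line $i" >> notes.txt            # hypothetical file
  git add notes.txt
  git commit -q -m "Revision $i"
done

git count-objects -v                     # before: objects are stored loose
git gc -q
git count-objects -v                     # after: objects are stored in a pack

loose="$(git count-objects -v | awk '/^count:/ {print $2}')"
packed="$(git count-objects -v | awk '/^in-pack:/ {print $2}')"
echo "loose=$loose packed=$packed"
```

Each of the three commits contributes one blob, one tree and one commit object, so nine objects move from loose storage into the pack.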

During the evolution of most projects, situations arise in which we wish to start work on new components or features that might represent a substantial deviation from the main line of evolution. Often, we would very much like to be able to quarantine the main thread of the project from these new developments. For example, we may wish to be able to continue tweaking the main project files (in order to address minor issues and bugs), while at the same time, performing major edits that take the project in a different direction.

This is called branching. The main evolutionary thread of the project is referred to as the main branch. Deviations from the main branch are generally called branches and can be given any name (other than ‘main’ or ‘HEAD’). For example, we could start a new branch called ‘Feature’ where we can evolve the project in one direction whilst still being able to actively develop the main branch at the same time. ‘Feature’ and ‘main’ branches are depicted in the left hand sequence of circles of the schematic below.

The circles represent commits (stored snapshots). We can see that the first commit is the common ancestor of the ‘Feature’ and ‘main’ branch. HEAD is a special reference that points to the tip of the currently active branch. It indicates where the next commit will be built onto. In the diagram above, HEAD is pointing to the last commit in main. Hence the next commit will build on this commit. To develop the Feature branch further, we first have to move HEAD to the tip of the Feature branch.
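HEAD is stored as a small text file inside the .git folder, so its movement can be watched directly. A minimal sketch, assuming a throwaway repository whose initial branch is explicitly named main and a placeholder identity:

```shell
set -eu
repo="$(mktemp -d)"                      # throwaway directory (assumption for this sketch)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # name the initial branch main, whatever the local default
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"

echo 'File 1' > file1
git add file1
git commit -q -m 'First commit'

cat .git/HEAD                            # ref: refs/heads/main
git branch Feature                       # new branch starting at the current commit
git checkout -q Feature                  # move HEAD to the tip of Feature
cat .git/HEAD                            # ref: refs/heads/Feature

head_ref="$(git symbolic-ref HEAD)"
```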

We can later merge the Feature branch into the main branch in order to make the new changes mainstream.
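As a rough sketch of that cycle (branch, commit on the branch, merge back), the following uses a throwaway repository. The file name feature.txt and the identity are placeholders.

```shell
set -eu
repo="$(mktemp -d)"                      # throwaway directory (assumption for this sketch)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # name the initial branch main
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"

echo 'File 1' > file1
git add file1
git commit -q -m 'First commit'

git checkout -q -b Feature               # create the branch and switch to it
echo 'Feature work' > feature.txt        # hypothetical file
git add feature.txt
git commit -q -m 'Add feature.txt on Feature'

git checkout -q main                     # feature.txt disappears from the workspace
git merge -q --no-edit Feature           # bring the Feature commits into main
git log --oneline
```

Because main received no commits of its own in the meantime, git can normally complete this merge by simply moving main forward to the tip of Feature.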

To support collaboration, there can also be a remote repository (referred to as origin and depicted by the squares in the figure above). Unlike a local repository, a remote repository does not contain a workspace as files are not directly edited in the remote repository. Instead, the remote repository acts as a permanently available conduit between multiple contributors.

In the diagram above, we can see that the remote repository (origin) has an additional branch (in this case called dev). The collaborator whose local repository is depicted above has either not yet obtained (pulled) this branch or has elected not to (as perhaps it is not a direction that they are involved in).

We also see that the main branch on the remote repository has a newer (additional) commit than the local repository.

Prior to working on a branch, a collaborator should first get any updates to the remote repository. This is a two step process. Firstly, the collaborator fetches any changes and then secondly merges those changes into their version of the branch. Collectively, these two actions are called a pull.

To make local changes available to others, the collaborator can push commits up to the remote repository. The pushed changes are applied directly to the nominated branch, so it is the user's responsibility to ensure, as much as possible, that their local repository already includes the most recent remote repository changes (by always pulling before pushing).
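The push and the two-step pull can be rehearsed entirely on one machine. In this sketch a bare repository in a temporary folder stands in for the remote, and two clones stand in for two collaborators. The names alice and bob, the paths and the identities are all hypothetical.

```shell
set -eu
base="$(mktemp -d)"                                   # throwaway folder (assumption)
git init -q --bare "$base/origin.git"                 # stands in for the remote repository
git --git-dir="$base/origin.git" symbolic-ref HEAD refs/heads/main

# First collaborator: clone, commit, push
git clone -q "$base/origin.git" "$base/alice" 2>/dev/null   # hides the 'empty repository' warning
cd "$base/alice"
git symbolic-ref HEAD refs/heads/main
git config user.name "Alice Example"
git config user.email "alice@example.com"
echo 'File 1' > file1
git add file1
git commit -q -m 'Add file1'
git push -q -u origin main

# Second collaborator clones at this point
git clone -q "$base/origin.git" "$base/bob"
git -C "$base/bob" config user.name "Bob Example"
git -C "$base/bob" config user.email "bob@example.com"

# First collaborator pushes one more commit
echo 'more' >> file1
git commit -q -am 'Extend file1'
git push -q origin main

# Second collaborator catches up: fetch, then merge (together, a pull)
cd "$base/bob"
git fetch -q origin
git merge -q --ff-only origin/main
git log --oneline
```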

3 Installation

Git Bash (Command Line Version):

  1. Download the Git for Windows installer from Git for Windows
    • Click the Download button
    • Select the latest version from the list of Assets
  2. Run the installer and follow the installation prompts.
  3. Choose the default options unless you have specific preferences.
  4. Select the default text editor (usually Vim) or choose another editor like Nano or Notepad++.
  5. Choose to use Git from the Windows Command Prompt (recommended).
  6. Complete the installation.

Using Homebrew:

  1. Open Terminal.
  2. Install Homebrew if not installed:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
  3. Install Git using Homebrew:
brew install git

Ubuntu/Debian:

sudo apt update
sudo apt install git

Fedora:

sudo dnf install git

Arch Linux:

sudo pacman -S git

Linux (Red Hat/CentOS):

sudo yum install git

To verify that the software is installed and accessible, open a terminal and issue the following:

git --version
git version 2.43.0

Windows:

On Windows, you can access a terminal via one of the following:

  • via the Command Prompt:
    • Press Win + R to open the Run dialog.
    • Type cmd and press Enter.
  • via PowerShell:
    • Press Win + X and select “Windows PowerShell.”
  • Git Bash (Optional):
    • if Git is installed (which we are hoping it is!), open “Git Bash” for a Unix-like terminal experience.

MacOS:

  • via Terminal:
    • Press Cmd + Space to open Spotlight.
    • Type terminal and press Enter.

Linux:

Oh please. You cannot seriously tell me that you are using Linux and don’t know how to access a terminal.

In the git --version command above, pay particular attention to the number of hyphens - there are two in a row and no spaces between the -- and the word version.

If you get output similar to above (an indication of what version of git you have on your system), then it is likely to be properly installed. If instead you get an error message, then it is likely that git is not properly installed and you should try again.

4 Getting started

Before using git, it is a good idea to define some global settings (settings that apply to all of your git repositories). These include your name and email address and, whilst not essential, they are attached to all actions you perform so that it is easier for others to track the source of changes.

git config --global user.name "Your Name"
git config --global user.email "your_email@whatever.com"
Note

In the above, you should replace “Your Name” with your actual name. This need not be a username (or even a real name) as it is not cross referenced anywhere. It is simply used in collaboration so that your collaborators know who is responsible for your commits.

Similarly, you should replace “your_email@whatever.com” with an email that you are likely to monitor. This need not be the same email address you have used to register a GitHub account etc, it is just so that collaborators have a way of contacting you.
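To confirm which values git will record, the settings can be read back with git config --get. The sketch below deliberately avoids touching your real global settings: it works in a throwaway repository and uses --local, which stores the value in that repository only and takes precedence over a --global value. The name and email shown are placeholders.

```shell
set -eu
repo="$(mktemp -d)"                      # throwaway directory (assumption for this sketch)
cd "$repo"
git init -q

# --local writes to this repository's .git/config rather than the global file
git config --local user.name "Project Persona"
git config --local user.email "persona@example.com"

git config --get user.name               # the value git will use inside this repository
git config --get user.email
name="$(git config --get user.name)"
```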

The remaining sections go through the major git versioning concepts. As previously indicated, git is a command driven program (technically a family of programs). Nevertheless, many other applications (such as RStudio) are able to interface directly with git for some of the more commonly used features. Hence, in addition to providing the command line syntax for performing each task, where possible, this tutorial will also provide instructions (with screen captures) for RStudio and emacs.

5 Setting up (initializing) a new repository

For the purpose of this tutorial, I will create a temporary folder in the tmp folder of my home directory in which to create and manipulate repositories. To follow along with this tutorial, you are encouraged to do similarly.

5.1 Initialize local repository

We will start by creating a new directory (folder) which we will call Repo1 in which to place our repository. All usual directory naming rules apply since it is just a regular directory.

mkdir -p ~/tmp/Repo1

To create (or initialize) a new local repository, issue the git init command in the root of the working directory you wish to contain the git repository. This can be either an empty directory or contain an existing directory/file structure. The git init command will add a folder called .git to the directory. This is a one time operation.

cd ~/tmp/Repo1
git init 
Initialized empty Git repository in /home/runner/tmp/Repo1/.git/

The .git folder contains all the necessary metadata to manage the repository.

ls -al
total 12
drwxr-xr-x 3 runner docker 4096 Jan 19 04:44 .
drwxr-xr-x 3 runner docker 4096 Jan 19 04:44 ..
drwxr-xr-x 7 runner docker 4096 Jan 19 04:44 .git
tree -a --charset unicode
.
`-- .git
    |-- HEAD
    |-- branches
    |-- config
    |-- description
    |-- hooks
    |   |-- applypatch-msg.sample
    |   |-- commit-msg.sample
    |   |-- fsmonitor-watchman.sample
    |   |-- post-update.sample
    |   |-- pre-applypatch.sample
    |   |-- pre-commit.sample
    |   |-- pre-merge-commit.sample
    |   |-- pre-push.sample
    |   |-- pre-rebase.sample
    |   |-- pre-receive.sample
    |   |-- prepare-commit-msg.sample
    |   |-- push-to-checkout.sample
    |   |-- sendemail-validate.sample
    |   `-- update.sample
    |-- info
    |   `-- exclude
    |-- objects
    |   |-- info
    |   `-- pack
    `-- refs
        |-- heads
        `-- tags

10 directories, 18 files
  • config - this file stores settings such as the location of a remote repository that this repository is linked to
  • description - lists the name (and version) of a repository
  • HEAD - lists a reference to the current checked out commit
  • hooks - a directory containing scripts that are executed at various stages (e.g. pre-push.sample is an example of a script executed prior to pushing)
  • info - contains a file exclude that lists exclusions (files not to be tracked). This is like .gitignore, except it is not versioned
  • objects - this directory contains SHA indexed files being tracked
  • refs - a master copy of all the repository refs
  • logs - contains a history of each branch
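As an illustration of the info/exclude file, the following sketch adds a pattern to it in a throwaway repository and checks that matching files no longer show up as untracked. The pattern *.log and the file names are hypothetical.

```shell
set -eu
repo="$(mktemp -d)"                      # throwaway directory (assumption for this sketch)
cd "$repo"
git init -q

mkdir -p .git/info
echo '*.log' >> .git/info/exclude        # local, unversioned exclusion rule
echo 'noise' > debug.log                 # hypothetical file that should be ignored
echo 'File 1' > file1

git status --porcelain                   # lists file1 only
git check-ignore -v debug.log            # reports the rule that matched debug.log
status="$(git status --porcelain)"
```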

The repository that we are going to create in this demonstration could be considered to be a new standalone analysis. In RStudio, this would be considered a project. So, we will initialise the git repository while we create a new RStudio project. To do so:

  1. click on the Project selector in the top right of the RStudio window (as highlighted by the red ellipse in the image below)

  2. select New Project from the dropdown menu

  3. select New Directory from the Create Project panel

  4. select New Project from the Project Type panel

  5. Provide a name for the new directory to be created and use the Browse button to locate a suitable position for this new directory. Ensure that the Create a git repository checkbox is checked

  6. Click the Create Project button

If successful, you should notice a couple of changes - these are highlighted in the following figure:

  • a new Git tab will appear in the top right panel
  • the contents of this newly created project/repository will appear in the Files tab of the bottom right panel

If the files and directories that begin with a . do not appear, click on the More file commands cog and make sure the Show Hidden Files option is ticked.

The newly created files/folders are:

  • .git - this directory houses the repository information and should not generally be edited directly
  • .gitignore - this file defines files/folders to be excluded from the repository. We will discuss this file more later
  • .Rhistory - this file will accrue a history of the commands you have evaluated in R within this project
  • .Rproj.user - this folder stores some project-specific temporary files
  • Repo1.Rproj - contains the project specific settings

Note that on the left side of the RStudio window there are two panels - one called “Console”, the other called “Terminal”. The console window is for issuing R commands and the terminal window is for issuing system (bash, shell) commands. Throughout this tutorial, as an alternative to using the point and click RStudio methods, you could instead issue the Terminal instructions into the “Terminal” panel. Indeed, there are some git commands that are not supported directly by RStudio and can only be entered into the terminal.

Note, at this stage, no files are being tracked, that is, they are not part of the repository.

To assist in gaining a greater understanding of the workings of git, we will use a series of schematic diagrams representing the contents of four important sections of the repository. Typically, these figures will be contained within callout panels that expand/collapse upon clicking. However, for this first time, they will be standalone.

In the first figure below, the left hand panel represents the contents of the root directory (excluding the .git folder) - this is the workspace and is currently empty.

The three white panels represent three important parts of the inner structure of the .git folder. A newly initialized repository is relatively devoid of any specific metadata since there are no staged or committed files. In the root of the .git folder, there is a file called HEAD.

The figure is currently very sparse. However, as the repository grows, so the figure will become more complex.

The second figure provides the same information, yet via a network diagram. Again, this will not be overly meaningful until the repository contains some content.

5.2 Initializing other types of repositories

The above demonstrated how to initialise a new local repository from scratch. However, there are times when we instead want to:

  • create a git repository from an existing directory or project
  • collaborate with someone on an existing repository
  • create a remote repository

These situations are briefly demonstrated in the following sections.

5.2.1 Initializing a shared (remote) repository

The main repository for sharing should not contain the working directory as such - only the .git tree and the .gitignore file. Typically the point of a remote repository is to act as a permanently available repository from which multiple users can exchange files. Consequently, those accessing this repository should only be able to interact with the .git metadata - they do not directly modify any files.

Since a remote repository is devoid of the working files and directories, it is referred to as bare.

To create a bare remote repository, issue the git init --bare command after logging in to the remote location.

git init --bare
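To see what "bare" means in practice, the sketch below creates a bare repository in a temporary folder (the path and the name shared.git are assumptions) and asks git to confirm its type. The items that normally live inside .git sit directly in the top-level folder.

```shell
set -eu
base="$(mktemp -d)"                      # throwaway folder (assumption for this sketch)
git init -q --bare "$base/shared.git"    # hypothetical repository name

ls "$base/shared.git"                    # HEAD, config, objects, refs, ... and no workspace
is_bare="$(git --git-dir="$base/shared.git" rev-parse --is-bare-repository)"
echo "bare: $is_bare"
```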

Use the instructions for the Terminal

5.2.2 Cloning an existing repository

To get your own local copy of an existing repository, issue the git clone <repo url> command from the directory in which you want the copy to be placed. The repo url points to the location of the existing repository to be cloned. By default, git creates a new folder named after the repository and places both the working files and the .git folder inside it. This is also a one time operation.

The repo url can be located on any accessible filesystem (local or remote). The cloning process also stores a link back to the original location of the repository (called origin). This provides a convenient way for the system to keep track of where the local repository should exchange files.

Many git repositories are hosted on sites such as GitHub, GitLab or Bitbucket. Within an online git repository, these sites provide url links for cloning.

git clone "url.git"

where "url.git" is the url of the hosted repository.
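Cloning also works between folders on the same machine, which makes it easy to try out. In this sketch a small local repository stands in for the hosted one (the folder names source and copy and the identity are placeholders), and git remote -v shows the stored link called origin.

```shell
set -eu
base="$(mktemp -d)"                      # throwaway folder (assumption for this sketch)

# A small repository that stands in for the existing (hosted) repository
git init -q "$base/source"
cd "$base/source"
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"
echo 'File 1' > file1
git add file1
git commit -q -m 'Add file1'

# Clone it: git creates the folder 'copy' and fills it
cd "$base"
git clone -q "$base/source" copy
cd copy
git remote -v                            # origin points back to the cloned location
remote_name="$(git remote)"
```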

  1. click on the Project selector in the top right of the RStudio window (as highlighted by the red ellipse in the image below)
  2. select New Project from the dropdown menu
  3. select Version Control from the Create Project panel
  4. select Git from the Create Project from Version Control panel
  5. paste in the address of the repository that you want to clone, optionally a name for this repository (if you do not like the original name) and use the Browse button to locate a suitable position for this new directory.
  6. Click the Create Project button

5.2.3 Initializing a repository in an existing directory

This is the same as for a new directory.

git init
  1. click on the Project selector in the top right of the RStudio window (as highlighted by the red ellipse in the image below)
  2. select New Project from the dropdown menu
  3. select Existing Directory from the Create Project panel
  4. use the Browse button to locate the existing directory
  5. Click the Create Project button

6 Tracking files

The basic workflow for tracking files is a two step process in which one or more files are first added to the staging area before they are committed to the local repository. The staging area acts a little like a snapshot of what the repository will look like once the changes have been committed. The staging area also acts like a buffer between the files in the workspace (actual local copy of files) and the local repository (committed changes).

The reason that this is a two step process is that it allows the user to make edits to numerous files, yet block the commits in smaller chunks to help isolate changes in case there is a need to roll back to previous versions.
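A sketch of how the two steps allow one editing session to be recorded as separate commits follows. It assumes a throwaway repository, and the file names report.txt and analysis.R are hypothetical.

```shell
set -eu
repo="$(mktemp -d)"                      # throwaway directory (assumption for this sketch)
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"

echo 'intro'  > report.txt               # hypothetical files
echo 'x <- 1' > analysis.R
git add report.txt analysis.R
git commit -q -m 'Initial files'

# One editing session touches both files...
echo 'fix typo' >> report.txt
echo 'y <- 2'   >> analysis.R

# ...but each change is staged and committed on its own
git add report.txt
git commit -q -m 'Fix typo in report'
git add analysis.R
git commit -q -m 'Add y to analysis'

git log --oneline --name-only            # each commit lists only the file it changed
commits="$(git rev-list --count HEAD)"
```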

6.1 Staging files

When a file is first added to the staging area, a full copy of that file is added to the staging area (not just the file diffs as in other versioning systems).

To demonstrate, let's create a file (a simple text file containing the string ‘File 1’) and add it to the staging area.

echo 'File 1' > file1

Now let's add this file to the staging area.

git add file1

To see the status of the repository (that is, what files are being tracked), we issue the git status command

git status
On branch main

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
    new file:   file1

This indicates that there is a single file (file1) in the staging area

To do the same in RStudio, let's create a file (a simple text file containing the string ‘File 1’) and add it to the staging area.

  1. Click the green “New File” button followed by the “Text File” option (or click the equivalent option from the “File” menu)

  2. Type File 1 in the panel with the flashing cursor. This panel represents the contents of the yet to be named file that we are creating.

  3. Click the “Save” or “Save all” buttons (or select the equivalent items from the “File” menu) and name the file “file1”

    Switch to the Git tab and you should notice a number of items (including the file we just created) in the panel. These are files that git is aware of, but not yet tracking. This panel acts as a status window. The yellow “?” symbol indicates that git considers these files “untracked”

  4. To stage a file, click on the corresponding checkbox - the status symbol should change to a green “A” (for added)

Our simple overview schematic represents the staging of file 1.

A schematic of the internal workings of git shows that a blob has been created in .git/objects. This is a compressed version of file1. Its filename is a 40 digit SHA-1 checksum hash representing the contents of file1. To reiterate, the blob name is a SHA-1 hash of the file contents (actually, the first two digits form a folder and the remaining 38 form the filename).

We can look at the contents of this blob using the git cat-file command. This command outputs the contents of a compressed object (blob, tree, commit) from either the object's name (or unique fraction thereof) or its tag (we will discuss tags later).

git cat-file blob 50fcd
File 1
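The blob name can be reproduced without staging anything by using git hash-object. The sketch below, run in a throwaway repository, hashes the same text that file1 contains. As a refinement of the description above, git hashes a short header (the word blob, the size in bytes and a NUL character) followed by the contents; the optional second step repeats that by hand and is skipped if the sha1sum tool is not available.

```shell
set -eu
repo="$(mktemp -d)"                      # throwaway directory (assumption for this sketch)
cd "$repo"
git init -q

# Ask git for the blob name of the text 'File 1' plus its trailing newline
blob="$(echo 'File 1' | git hash-object --stdin)"
echo "$blob"                             # 50fcd26d6ce3000f9d5f12904e80eccdc5685dd1

# By hand: SHA-1 over 'blob 7', a NUL byte, then the 7 bytes of content
if command -v sha1sum >/dev/null 2>&1; then
  printf 'blob 7\0File 1\n' | sha1sum | cut -d' ' -f1
fi
```

The result matches the blob name shown for file1 in the tree listings elsewhere in this tutorial, since the name depends only on the contents.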

The add (staging) process also created an index file. This file simply points to the blob that is part of the snapshot. The git internals schematic illustrates the internal changes in response to staging a file.
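The index is a binary file, so it cannot be read directly, but git ls-files --stage prints its entries. A minimal sketch in a throwaway repository:

```shell
set -eu
repo="$(mktemp -d)"                      # throwaway directory (assumption for this sketch)
cd "$repo"
git init -q

echo 'File 1' > file1
git add file1

# Each line: <mode> <blob name> <stage number> <path>
git ls-files --stage
staged_blob="$(git ls-files --stage | awk '{print $2}')"
```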

6.2 Commit to local repository

To commit a set of changes from the staging area to the local repository, we issue the git commit command. We usually add the -m switch to explicitly supply a message to be associated with the commit. This message should ideally describe the changes the commit introduces to the repository.

git commit -m 'Initial repo and added file1'
[main (root-commit) fbd5ddb] Initial repo and added file1
 1 file changed, 1 insertion(+)
 create mode 100644 file1

We now see that the status has changed. It indicates that the tree in the workspace is in sync with the repository.

git status
On branch main
nothing to commit, working tree clean

To commit a set of changes from the staging area to the local repository in RStudio:

  1. click on the “Commit” button to open the “Review Changes” window

    This box will list the files to be committed (in this case “file1”), the changes in this file since the previous commit (as this is the first time this file has been committed, the changes are the file contents)

  2. you should also provide a commit message (in the figure above, I entered “Initial commit”). This message should ideally describe the changes the commit introduces to the repository.

  3. click the “Commit” button and you will be presented with a popup message.

    This message provides feedback to confirm that your commit was successful.

  4. close the popup window and the “Review Changes” window

file1 should now have disappeared from the git status panel.

Our simple overview schematic represents the committing of file 1.

The following modifications have occurred (in reverse order to how they actually occur):

  • The main branch reference was created. There is currently only a single branch (more on branches later). The branch reference points to (indicates) which commit is the current commit within a branch.

    cat .git/refs/heads/main
    fbd5ddb880ba91b13658b8747292e53ff05bf0e9
  • A commit was created. This points to a tree (which itself points to the blob representing file1) as well as other important metadata (such as who made the commit and when). Since the time stamp will be unique each time a snapshot is committed, so too the name of the commit (as a SHA-1 checksum hash) will differ. To reiterate, the names of blobs and trees are determined by contents alone, whereas commit names also incorporate the commit timestamp and details of the committer - and are thus virtually unique.

    git cat-file commit fbd5d
    tree 07a941b332d756f9a8acc9fdaf58aab5c7a43f64
    author pcinereus <i.obesulus@gmail.com> 1705639467 +0000
    committer pcinereus <i.obesulus@gmail.com> 1705639467 +0000
    
    Initial repo and added file1
  • A tree object was created. This represents the directory tree of the snapshot (commit) and thus points to the blobs.

    git ls-tree fbd5d
    100644 blob 50fcd26d6ce3000f9d5f12904e80eccdc5685dd1  file1

    Or most commonly (if interested in the latest commit):

    git ls-tree HEAD
    100644 blob 50fcd26d6ce3000f9d5f12904e80eccdc5685dd1  file1
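The distinction between content-determined names (blobs, trees) and commit names can be checked with a small experiment. This sketch builds two throwaway repositories with identical content but different commit dates, using the GIT_AUTHOR_DATE and GIT_COMMITTER_DATE environment variables. The helper make_repo, the paths, the dates and the identity are all hypothetical.

```shell
set -eu
base="$(mktemp -d)"                      # throwaway folder (assumption for this sketch)

make_repo() {                            # hypothetical helper: $1 = folder, $2 = commit date
  git init -q "$1"
  (
    cd "$1"
    git config user.name "Example User"  # placeholder identity
    git config user.email "user@example.com"
    echo 'File 1' > file1
    git add file1
    GIT_AUTHOR_DATE="$2" GIT_COMMITTER_DATE="$2" \
      git commit -q -m 'Initial repo and added file1'
  )
}

make_repo "$base/a" '2024-01-19 04:44:27 +0000'
make_repo "$base/b" '2024-01-20 04:44:27 +0000'

tree_a="$(git -C "$base/a" rev-parse 'HEAD^{tree}')"
tree_b="$(git -C "$base/b" rev-parse 'HEAD^{tree}')"
commit_a="$(git -C "$base/a" rev-parse HEAD)"
commit_b="$(git -C "$base/b" rev-parse HEAD)"

echo "trees:   $tree_a $tree_b"          # identical: same content
echo "commits: $commit_a $commit_b"      # different: the dates differ
```

The tree name is the same as the one listed in the commit above, because it depends only on the file name, mode and contents.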

The schematic now looks like

Committing staged changes creates an object under the .git tree.

tree -a --charset unicode
.
|-- .git
|   |-- COMMIT_EDITMSG
|   |-- HEAD
|   |-- branches
|   |-- config
|   |-- description
|   |-- hooks
|   |   |-- applypatch-msg.sample
|   |   |-- commit-msg.sample
|   |   |-- fsmonitor-watchman.sample
|   |   |-- post-update.sample
|   |   |-- pre-applypatch.sample
|   |   |-- pre-commit.sample
|   |   |-- pre-merge-commit.sample
|   |   |-- pre-push.sample
|   |   |-- pre-rebase.sample
|   |   |-- pre-receive.sample
|   |   |-- prepare-commit-msg.sample
|   |   |-- push-to-checkout.sample
|   |   |-- sendemail-validate.sample
|   |   `-- update.sample
|   |-- index
|   |-- info
|   |   `-- exclude
|   |-- logs
|   |   |-- HEAD
|   |   `-- refs
|   |       `-- heads
|   |           `-- main
|   |-- objects
|   |   |-- 07
|   |   |   `-- a941b332d756f9a8acc9fdaf58aab5c7a43f64
|   |   |-- 50
|   |   |   `-- fcd26d6ce3000f9d5f12904e80eccdc5685dd1
|   |   |-- fb
|   |   |   `-- d5ddb880ba91b13658b8747292e53ff05bf0e9
|   |   |-- info
|   |   `-- pack
|   `-- refs
|       |-- heads
|       |   `-- main
|       `-- tags
`-- file1

16 directories, 27 files
git cat-file -p HEAD
tree 07a941b332d756f9a8acc9fdaf58aab5c7a43f64
author pcinereus <i.obesulus@gmail.com> 1705639467 +0000
committer pcinereus <i.obesulus@gmail.com> 1705639467 +0000

Initial repo and added file1
git cat-file -p HEAD^{tree}
100644 blob 50fcd26d6ce3000f9d5f12904e80eccdc5685dd1    file1
git log --oneline
fbd5ddb Initial repo and added file1

6.3 More changes

Whenever a file is added or modified, if the changes are to be tracked, the file needs to be added to the staging area. Let's demonstrate by modifying file1 and adding an additional file (file2), this time to a subdirectory (dir1).

echo '---------------' >> file1
mkdir dir1
echo '* Notes' > dir1/file2
git add file1 dir1/file2

Now if we re-examine the status:

git status
On branch main
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
    new file:   dir1/file2
    modified:   file1
  1. modify file1 by adding a number of hyphens under the File 1 like in the figure below

  2. save the file. As you do so, you should notice that the file reappears in the status panel (this time with a blue “M” to signify that the file has been modified)

  3. to create the subdirectory, click on the “Add a new folder” icon and then enter a name for the subdirectory in the popup box (as per figure below)

  4. navigate to this new directory (dir1)

  5. click the “Create a new blank file in current directory” button and select “Text file”

  6. enter a new filename (file2) into the popup box

  7. enter some text into this file (like in the figure below)

  8. save the file and notice that the dir1 directory is now also in the git status panel (yet its status is “untracked”)

  9. stage both file1 and dir1 (click on the corresponding checkboxes)

And now our schematic looks like:

So when staging, the following has been performed:

  • the index file has been updated

    git ls-files --stage
    100644 4fcc8f85f738deb6cbb17db1ed3da241ad6cdf39 0 dir1/file2
    100644 28ed2456cbfa8a18a280c8af5b422e91e88ff64d 0 file1
  • two new blobs have been generated. One representing the modified file1 and the other representing the newly created file2 in the dir1 folder. The blob that represented the original file1 contents is still present and indeed is still the one currently committed. Blobs are not erased or modified.
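That staging alone writes a blob (before any commit exists) can be checked with a small sketch such as the following. It assumes a fresh throwaway repository; the variable names are illustrative only.

```shell
# Work in a throwaway directory so no real project is touched
demo=$(mktemp -d) && cd "$demo"
git init -q

echo '* Notes' > file2
before=$(find .git/objects -type f | wc -l)   # stored objects before staging
git add file2
after=$(find .git/objects -type f | wc -l)    # stored objects after staging

# Each index entry records: mode, blob name, stage number and path
blob=$(git ls-files --stage file2 | awk '{print $2}')
kind=$(git cat-file -t "$blob")               # type of the staged object
content=$(git cat-file -p "$blob")            # contents of the staged object

echo "objects: $before -> $after"
echo "$kind $blob contains: $content"
```

No commit has been made at this point, yet the object store already holds the staged contents.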

Now we will commit this snapshot.

git commit -m 'Modified file1 and added file2 (in dir1)'
[main 64bffa2] Modified file1 and added file2 (in dir1)
 2 files changed, 2 insertions(+)
 create mode 100644 dir1/file2
  1. click the “Commit” button

  2. you might like to explore the changes associated with each file

  3. enter a commit message (as in the figure below)

  4. click the “Commit” button

  5. after checking that the “Git Commit” popup does not contain any errors, close the popup

  6. to explore the repository history, click on the “History” button in the top left corner of the “Review Changes” window

    This provides a graphical list of commits (in reverse chronological order)

  7. once you have finished exploring the history, you can close the “Review Changes” window

The following modifications occur:

  • the main branch now points to the new commit.

    cat .git/refs/heads/main
    64bffa244a1123c348ee04c9a10abd1012cb1c9d
    git reflog
    64bffa2 HEAD@{0}: commit: Modified file1 and added file2 (in dir1)
    fbd5ddb HEAD@{1}: commit (initial): Initial repo and added file1
  • a new commit was created. This points to a new root tree object and also points to the previous commit (its parent).

    git cat-file commit 64bff
    tree 2b61e2b3db9d1708269cf9d1aeaae2b0a2af1a23
    parent fbd5ddb880ba91b13658b8747292e53ff05bf0e9
    author pcinereus <i.obesulus@gmail.com> 1705639473 +0000
    committer pcinereus <i.obesulus@gmail.com> 1705639473 +0000
    
    Modified file1 and added file2 (in dir1)
  • new root tree was created. This points to a blob representing the modified file1 as well as a newly created sub-directory tree representing the dir1 folder.

    git ls-tree 2b61e
    040000 tree f2fa54609fe5e918f365e0d5ffaf9a3aea88d541  dir1
    100644 blob 28ed2456cbfa8a18a280c8af5b422e91e88ff64d  file1
    git cat-file -p HEAD^{tree}
    040000 tree f2fa54609fe5e918f365e0d5ffaf9a3aea88d541  dir1
    100644 blob 28ed2456cbfa8a18a280c8af5b422e91e88ff64d  file1
  • a new sub-directory tree was created. This represents the dir1 folder and points to the blob representing file2. It appears as the dir1 entry (of type tree) in the listing of the root tree.

    git ls-tree 64bff
    040000 tree f2fa54609fe5e918f365e0d5ffaf9a3aea88d541  dir1
    100644 blob 28ed2456cbfa8a18a280c8af5b422e91e88ff64d  file1

    OR,

    git ls-tree HEAD
    040000 tree f2fa54609fe5e918f365e0d5ffaf9a3aea88d541  dir1
    100644 blob 28ed2456cbfa8a18a280c8af5b422e91e88ff64d  file1
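The parent link and the contents of the sub-directory tree can be inspected directly. The following is a sketch that rebuilds the first two commits in a throwaway repository (the temporary directory and 'Demo User' identity are assumptions for the demonstration); `HEAD^` names the parent commit and `HEAD:dir1` names the tree for the dir1 folder.

```shell
# Work in a throwaway directory so no real project is touched
demo=$(mktemp -d) && cd "$demo"
git init -q
git config user.name 'Demo User'          # hypothetical identity for the demo
git config user.email 'demo@example.com'

echo 'File 1' > file1
git add file1
git commit -q -m 'Initial repo and added file1'
first=$(git rev-parse HEAD)               # remember the first commit

echo '---------------' >> file1
mkdir dir1
echo '* Notes' > dir1/file2
git add file1 dir1/file2
git commit -q -m 'Modified file1 and added file2 (in dir1)'

parent=$(git rev-parse 'HEAD^')           # parent of the new commit
echo "parent: $parent"

git ls-tree HEAD                          # root tree: dir1 (tree) and file1 (blob)
subtree=$(git ls-tree HEAD:dir1)          # sub-directory tree: file2 (blob)
echo "$subtree"
```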

Committing staged changes creates an object under the .git tree.

tree -a --charset unicode
.
|-- .git
|   |-- COMMIT_EDITMSG
|   |-- HEAD
|   |-- branches
|   |-- config
|   |-- description
|   |-- hooks
|   |   |-- applypatch-msg.sample
|   |   |-- commit-msg.sample
|   |   |-- fsmonitor-watchman.sample
|   |   |-- post-update.sample
|   |   |-- pre-applypatch.sample
|   |   |-- pre-commit.sample
|   |   |-- pre-merge-commit.sample
|   |   |-- pre-push.sample
|   |   |-- pre-rebase.sample
|   |   |-- pre-receive.sample
|   |   |-- prepare-commit-msg.sample
|   |   |-- push-to-checkout.sample
|   |   |-- sendemail-validate.sample
|   |   `-- update.sample
|   |-- index
|   |-- info
|   |   `-- exclude
|   |-- logs
|   |   |-- HEAD
|   |   `-- refs
|   |       `-- heads
|   |           `-- main
|   |-- objects
|   |   |-- 07
|   |   |   `-- a941b332d756f9a8acc9fdaf58aab5c7a43f64
|   |   |-- 28
|   |   |   `-- ed2456cbfa8a18a280c8af5b422e91e88ff64d
|   |   |-- 2b
|   |   |   `-- 61e2b3db9d1708269cf9d1aeaae2b0a2af1a23
|   |   |-- 4f
|   |   |   `-- cc8f85f738deb6cbb17db1ed3da241ad6cdf39
|   |   |-- 50
|   |   |   `-- fcd26d6ce3000f9d5f12904e80eccdc5685dd1
|   |   |-- 64
|   |   |   `-- bffa244a1123c348ee04c9a10abd1012cb1c9d
|   |   |-- f2
|   |   |   `-- fa54609fe5e918f365e0d5ffaf9a3aea88d541
|   |   |-- fb
|   |   |   `-- d5ddb880ba91b13658b8747292e53ff05bf0e9
|   |   |-- info
|   |   `-- pack
|   `-- refs
|       |-- heads
|       |   `-- main
|       `-- tags
|-- dir1
|   `-- file2
`-- file1

22 directories, 33 files
git cat-file -p HEAD
tree 2b61e2b3db9d1708269cf9d1aeaae2b0a2af1a23
parent fbd5ddb880ba91b13658b8747292e53ff05bf0e9
author pcinereus <i.obesulus@gmail.com> 1705639473 +0000
committer pcinereus <i.obesulus@gmail.com> 1705639473 +0000

Modified file1 and added file2 (in dir1)
git cat-file -p HEAD^{tree}
040000 tree f2fa54609fe5e918f365e0d5ffaf9a3aea88d541    dir1
100644 blob 28ed2456cbfa8a18a280c8af5b422e91e88ff64d    file1
git log --oneline
64bffa2 Modified file1 and added file2 (in dir1)
fbd5ddb Initial repo and added file1

Now you might be wondering… What if I have modified many files and I want to stage them all? Do I really have to add each file individually? Is there not some way to add multiple files at a time? The answer of course is yes. To stage all files (including those in subdirectories) we issue the git add . command (notice the dot).

git add .
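Before staging everything, it can be reassuring to preview what git add . would pick up. The sketch below uses the --dry-run option in a throwaway repository; the file names (a.txt, sub/b.txt) are hypothetical.

```shell
# Work in a throwaway directory so no real project is touched
demo=$(mktemp -d) && cd "$demo"
git init -q

echo 'a' > a.txt                      # hypothetical files for the demo
mkdir sub
echo 'b' > sub/b.txt

# List what would be staged without actually staging anything
preview=$(git add --dry-run .)
echo "$preview"
staged_after_preview=$(git ls-files | wc -l)

# Now stage for real (the dot means the current directory, recursively)
git add .
staged_after_add=$(git ls-files | wc -l)
echo "staged after preview: $staged_after_preview, after add: $staged_after_add"
```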

6.4 .gitignore

Whilst it is convenient to not have to list every file that you want to be staged (added), what about files that we don’t want to get staged and committed? It is possible to define a file (called .gitignore) that lists the files (or file patterns) to be excluded when we request that all files be added. This file is typically placed in the root of the repository working directory.

For example, we may have temporary files, automatic backup files, or files generated as intermediates in a compile process. These files are commonly generated in the process of working with files in a project, yet we do not necessarily wish for them to be tracked. Often these files have a very predictable filename pattern (such as ending with a # or ~ symbol or having a specific file extension such as .aux).

As an example, when working with a project in Rstudio, files (such as .Rhistory) and directories (such as .Rproj.user) are automatically added to the file system and thus appear as untracked files in git status.

Hence, we can create a .gitignore to exclude these files/directories. Indeed, if you are using Rstudio, you might have noticed that a .gitignore file was automatically created when you created the project.

Let's start by modifying file2 and creating a new file f.tmp (that we want to ignore).

echo '---' >> dir1/file2
echo 'temp' > dir1/f.tmp
  1. navigate to the dir1 directory and open file2 for editing (or just make sure you are on the file2 tab)
  2. edit the file by adding a new line that contains just three hyphens (---) before saving the file
  3. in the same dir1 directory add another new text file (f.tmp) and edit this file to contain the word temp (then save the file)

The Git status panel should display both of these as untracked files.

To ignore the f.tmp file, we could either explicitly add this file as a row in a .gitignore file, or else we could supply a wildcard version that will ignore all files ending in .tmp.

echo '*.tmp' > .gitignore
cat .gitignore
*.tmp
  1. navigate back to the root of the project

  2. click on the gitignore file to open it up for editing

  3. navigate to the end of this file and add a newline containing the text *.tmp

You will notice that this .gitignore file already had items in it before you started editing it. These were added by Rstudio when you first created the new project.

The first item is .Rproj.user and its presence in this file is why it does not appear in the git status panel.

Once we save the .gitignore file, notice how the f.tmp file is similarly removed from the git status panel - since via .gitignore we have indicated that we want to ignore this file (not track it as part of our version control system).

Entry          Meaning
file1          DO NOT stage (add) file1
*.tmp          DO NOT stage (add) any file ending in .tmp
/dir1/*        DO NOT stage (add) the folder called dir1 (or any of its contents) unless this is specifically negated (see next line)
!/dir1/file2   DO stage (add) the file called file2 in the dir1 folder
Now when we go to add all files to the staging area, those that fall under the exclude rules will be ignored

git add .
git status
On branch main
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
    new file:   .gitignore
    modified:   dir1/file2

You will notice that .gitignore was added as a new file and dir1/file2 was marked as modified yet dir1/f.tmp was totally ignored.
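Ignored files are hidden from the normal status output, but they can be listed on request. The following sketch (in a throwaway repository with a hypothetical 'Demo User' identity) uses the --ignored option of git status; in the short format, ignored paths are flagged with !!.

```shell
# Work in a throwaway directory so no real project is touched
demo=$(mktemp -d) && cd "$demo"
git init -q
git config user.name 'Demo User'          # hypothetical identity for the demo
git config user.email 'demo@example.com'

mkdir dir1
echo '* Notes' > dir1/file2
git add .
git commit -q -m 'Added file2'

echo 'temp' > dir1/f.tmp                  # a file we want ignored
echo '*.tmp' > .gitignore
git add .

# Short status, additionally listing ignored files (flagged with !!)
report=$(git status --short --ignored)
echo "$report"
```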

  1. check the boxes next to each of the files listed in the status panel

Let's now commit these changes.

git commit -m 'Modified file2, added .gitignore'
[main 778d70d] Modified file2, added .gitignore
 2 files changed, 2 insertions(+)
 create mode 100644 .gitignore
git status
On branch main
nothing to commit, working tree clean
  1. click on the “Commit” button
  2. add a commit message (such as Modified file2, added .gitignore)
  3. click the “Commit” button
  4. close the popup
  5. close the “Review Changes” window

For those still interested in the schematic…

Committing staged changes creates an object under the .git tree.

tree -a --charset unicode
.
|-- .git
|   |-- COMMIT_EDITMSG
|   |-- HEAD
|   |-- branches
|   |-- config
|   |-- description
|   |-- hooks
|   |   |-- applypatch-msg.sample
|   |   |-- commit-msg.sample
|   |   |-- fsmonitor-watchman.sample
|   |   |-- post-update.sample
|   |   |-- pre-applypatch.sample
|   |   |-- pre-commit.sample
|   |   |-- pre-merge-commit.sample
|   |   |-- pre-push.sample
|   |   |-- pre-rebase.sample
|   |   |-- pre-receive.sample
|   |   |-- prepare-commit-msg.sample
|   |   |-- push-to-checkout.sample
|   |   |-- sendemail-validate.sample
|   |   `-- update.sample
|   |-- index
|   |-- info
|   |   `-- exclude
|   |-- logs
|   |   |-- HEAD
|   |   `-- refs
|   |       `-- heads
|   |           `-- main
|   |-- objects
|   |   |-- 07
|   |   |   `-- a941b332d756f9a8acc9fdaf58aab5c7a43f64
|   |   |-- 14
|   |   |   `-- 3a8bb5a2cc05a91f83a87af18c8eb5885a375c
|   |   |-- 19
|   |   |   `-- 44fd61e7c53bcc19e6f3eb94cc800508944a25
|   |   |-- 28
|   |   |   `-- ed2456cbfa8a18a280c8af5b422e91e88ff64d
|   |   |-- 2b
|   |   |   `-- 61e2b3db9d1708269cf9d1aeaae2b0a2af1a23
|   |   |-- 3c
|   |   |   `-- 7af0d3ccea71c9af82fa0ce68532272edcf1b8
|   |   |-- 4f
|   |   |   `-- cc8f85f738deb6cbb17db1ed3da241ad6cdf39
|   |   |-- 50
|   |   |   `-- fcd26d6ce3000f9d5f12904e80eccdc5685dd1
|   |   |-- 64
|   |   |   `-- bffa244a1123c348ee04c9a10abd1012cb1c9d
|   |   |-- 77
|   |   |   `-- 8d70d3a1103eee2740a68cc1d18c516502eece
|   |   |-- c4
|   |   |   `-- 26a67af50d13828ec73b3c560b2648e2f3dc08
|   |   |-- f2
|   |   |   `-- fa54609fe5e918f365e0d5ffaf9a3aea88d541
|   |   |-- fb
|   |   |   `-- d5ddb880ba91b13658b8747292e53ff05bf0e9
|   |   |-- info
|   |   `-- pack
|   `-- refs
|       |-- heads
|       |   `-- main
|       `-- tags
|-- .gitignore
|-- dir1
|   |-- f.tmp
|   `-- file2
`-- file1

27 directories, 40 files
git cat-file -p HEAD
tree 3c7af0d3ccea71c9af82fa0ce68532272edcf1b8
parent 64bffa244a1123c348ee04c9a10abd1012cb1c9d
author pcinereus <i.obesulus@gmail.com> 1705639475 +0000
committer pcinereus <i.obesulus@gmail.com> 1705639475 +0000

Modified file2, added .gitignore
git cat-file -p HEAD^{tree}
100644 blob 1944fd61e7c53bcc19e6f3eb94cc800508944a25    .gitignore
040000 tree c426a67af50d13828ec73b3c560b2648e2f3dc08    dir1
100644 blob 28ed2456cbfa8a18a280c8af5b422e91e88ff64d    file1
git log --oneline
778d70d Modified file2, added .gitignore
64bffa2 Modified file1 and added file2 (in dir1)
fbd5ddb Initial repo and added file1

7 Inspecting a repository

For this section, we will be working on the repository built up in the previous section. If you did not follow along with the previous section, I suggest that you expand the following callout and run the provided code in a terminal.

If you already have the repository, you can ignore the commands to create the repository.

Issue the following commands in your terminal

rm -rf ~/tmp/Repo1
mkdir -p ~/tmp/Repo1
cd ~/tmp/Repo1
git init 
echo 'File 1' > file1
git add file1
git commit -m 'Initial repo and added file1'
echo '---------------' >> file1
mkdir dir1
echo '* Notes' > dir1/file2
git add file1 dir1/file2
git commit -m 'Modified file1 and added file2 (in dir1)'
echo '---' >> dir1/file2
echo 'temp' > dir1/f.tmp
echo '*.tmp' > .gitignore
git add .
git commit -m 'Modified file2, added .gitignore'
tree -ra -L 2 --charset ascii
.
|-- file1
|-- dir1
|   |-- file2
|   `-- f.tmp
|-- .gitignore
`-- .git
    |-- refs
    |-- objects
    |-- logs
    |-- info
    |-- index
    |-- hooks
    |-- description
    |-- config
    |-- branches
    |-- HEAD
    `-- COMMIT_EDITMSG

8 directories, 9 files

7.1 Status of workspace and staging area

Recall that within the .git environment, files can be in one of four states:

  • untracked
  • modified
  • staged
  • committed

To inspect the status of files in your workspace, you can issue the git status command (as we have done on numerous occasions above). This command displays the current state of the workspace and staging area.

git status
On branch main
nothing to commit, working tree clean

The output of git status partitions the files into three groups (staged: “Changes to be committed”, unstaged: “Changes not staged for commit”, and “Untracked files”) and provides hints on how to either promote or demote the status of these files.

Examine the git status panel. Ideally it should be empty, thereby signalling that all your important files are tracked and committed.
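The states listed above can be seen side by side with the compact form of the status output. This is a sketch in a throwaway repository; the file names and the 'Demo User' identity are hypothetical. In the short format, the first column describes the staging area and the second the workspace, with ?? marking untracked files.

```shell
# Work in a throwaway directory so no real project is touched
demo=$(mktemp -d) && cd "$demo"
git init -q
git config user.name 'Demo User'          # hypothetical identity for the demo
git config user.email 'demo@example.com'

echo 'one' > committed.txt
git add committed.txt
git commit -q -m 'First commit'           # committed

echo 'two' >> committed.txt               # modified (not staged)
echo 'new' > staged.txt
git add staged.txt                        # staged
echo 'tmp' > untracked.txt                # untracked

summary=$(git status --short)
printf '%s\n' "$summary"
```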

7.1.1 log of commits

The git log command allows us to review the history of committed snapshots

git log
commit 778d70d3a1103eee2740a68cc1d18c516502eece
Author: pcinereus <i.obesulus@gmail.com>
Date:   Fri Jan 19 04:44:35 2024 +0000

    Modified file2, added .gitignore

commit 64bffa244a1123c348ee04c9a10abd1012cb1c9d
Author: pcinereus <i.obesulus@gmail.com>
Date:   Fri Jan 19 04:44:33 2024 +0000

    Modified file1 and added file2 (in dir1)

commit fbd5ddb880ba91b13658b8747292e53ff05bf0e9
Author: pcinereus <i.obesulus@gmail.com>
Date:   Fri Jan 19 04:44:27 2024 +0000

    Initial repo and added file1

We can see that in my case some fool called ‘pcinereus’ (Murray Logan) has made a total of three commits. We can also see the date/time that the commits were made as well as the supplied commit comment.

Over time, repositories accumulate a large number of commits. To review only the last 2 commits, we could issue the git log -n 2 command.

git log -n 2 
commit 778d70d3a1103eee2740a68cc1d18c516502eece
Author: pcinereus <i.obesulus@gmail.com>
Date:   Fri Jan 19 04:44:35 2024 +0000

    Modified file2, added .gitignore

commit 64bffa244a1123c348ee04c9a10abd1012cb1c9d
Author: pcinereus <i.obesulus@gmail.com>
Date:   Fri Jan 19 04:44:33 2024 +0000

    Modified file1 and added file2 (in dir1)
Option Example
--oneline
Condensed view
git log --oneline
778d70d Modified file2, added .gitignore
64bffa2 Modified file1 and added file2 (in dir1)
fbd5ddb Initial repo and added file1
--stat
Indicates number of changes
git log --stat
commit 778d70d3a1103eee2740a68cc1d18c516502eece
Author: pcinereus <i.obesulus@gmail.com>
Date:   Fri Jan 19 04:44:35 2024 +0000

    Modified file2, added .gitignore

 .gitignore | 1 +
 dir1/file2 | 1 +
 2 files changed, 2 insertions(+)

commit 64bffa244a1123c348ee04c9a10abd1012cb1c9d
Author: pcinereus <i.obesulus@gmail.com>
Date:   Fri Jan 19 04:44:33 2024 +0000

    Modified file1 and added file2 (in dir1)

 dir1/file2 | 1 +
 file1      | 1 +
 2 files changed, 2 insertions(+)

commit fbd5ddb880ba91b13658b8747292e53ff05bf0e9
Author: pcinereus <i.obesulus@gmail.com>
Date:   Fri Jan 19 04:44:27 2024 +0000

    Initial repo and added file1

 file1 | 1 +
 1 file changed, 1 insertion(+)
-p
Displays the full diff of each commit
git log -p
commit 778d70d3a1103eee2740a68cc1d18c516502eece
Author: pcinereus <i.obesulus@gmail.com>
Date:   Fri Jan 19 04:44:35 2024 +0000

    Modified file2, added .gitignore

diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..1944fd6
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1 @@
+*.tmp
diff --git a/dir1/file2 b/dir1/file2
index 4fcc8f8..143a8bb 100644
--- a/dir1/file2
+++ b/dir1/file2
@@ -1 +1,2 @@
 * Notes
+---

commit 64bffa244a1123c348ee04c9a10abd1012cb1c9d
Author: pcinereus <i.obesulus@gmail.com>
Date:   Fri Jan 19 04:44:33 2024 +0000

    Modified file1 and added file2 (in dir1)

diff --git a/dir1/file2 b/dir1/file2
new file mode 100644
index 0000000..4fcc8f8
--- /dev/null
+++ b/dir1/file2
@@ -0,0 +1 @@
+* Notes
diff --git a/file1 b/file1
index 50fcd26..28ed245 100644
--- a/file1
+++ b/file1
@@ -1 +1,2 @@
 File 1
+---------------

commit fbd5ddb880ba91b13658b8747292e53ff05bf0e9
Author: pcinereus <i.obesulus@gmail.com>
Date:   Fri Jan 19 04:44:27 2024 +0000

    Initial repo and added file1

diff --git a/file1 b/file1
new file mode 100644
index 0000000..50fcd26
--- /dev/null
+++ b/file1
@@ -0,0 +1 @@
+File 1
--author="<name>"
Filter by author
git log --author="Murray"
--grep="<pattern>"
Filter by regex pattern of commit message
git log --grep="Modified"
commit 778d70d3a1103eee2740a68cc1d18c516502eece
Author: pcinereus <i.obesulus@gmail.com>
Date:   Fri Jan 19 04:44:35 2024 +0000

    Modified file2, added .gitignore

commit 64bffa244a1123c348ee04c9a10abd1012cb1c9d
Author: pcinereus <i.obesulus@gmail.com>
Date:   Fri Jan 19 04:44:33 2024 +0000

    Modified file1 and added file2 (in dir1)
<file>
Filter by filename
git log file1
commit 64bffa244a1123c348ee04c9a10abd1012cb1c9d
Author: pcinereus <i.obesulus@gmail.com>
Date:   Fri Jan 19 04:44:33 2024 +0000

    Modified file1 and added file2 (in dir1)

commit fbd5ddb880ba91b13658b8747292e53ff05bf0e9
Author: pcinereus <i.obesulus@gmail.com>
Date:   Fri Jan 19 04:44:27 2024 +0000

    Initial repo and added file1
--decorate --graph
git log --graph --decorate --oneline
* 778d70d (HEAD -> main) Modified file2, added .gitignore
* 64bffa2 Modified file1 and added file2 (in dir1)
* fbd5ddb Initial repo and added file1
--all
All branches
git log --all
commit 778d70d3a1103eee2740a68cc1d18c516502eece
Author: pcinereus <i.obesulus@gmail.com>
Date:   Fri Jan 19 04:44:35 2024 +0000

    Modified file2, added .gitignore

commit 64bffa244a1123c348ee04c9a10abd1012cb1c9d
Author: pcinereus <i.obesulus@gmail.com>
Date:   Fri Jan 19 04:44:33 2024 +0000

    Modified file1 and added file2 (in dir1)

commit fbd5ddb880ba91b13658b8747292e53ff05bf0e9
Author: pcinereus <i.obesulus@gmail.com>
Date:   Fri Jan 19 04:44:27 2024 +0000

    Initial repo and added file1
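The options in the table above can be combined. The sketch below rebuilds the three commits in a throwaway repository (with a hypothetical 'Demo User' identity) and then counts the commits that match a filename filter and a message filter.

```shell
# Work in a throwaway directory so no real project is touched
demo=$(mktemp -d) && cd "$demo"
git init -q
git config user.name 'Demo User'          # hypothetical identity for the demo
git config user.email 'demo@example.com'

echo 'File 1' > file1
git add file1
git commit -q -m 'Initial repo and added file1'
echo '---------------' >> file1
mkdir dir1
echo '* Notes' > dir1/file2
git add file1 dir1/file2
git commit -q -m 'Modified file1 and added file2 (in dir1)'
echo '---' >> dir1/file2
echo '*.tmp' > .gitignore
git add .
git commit -q -m 'Modified file2, added .gitignore'

# Condensed view restricted to the two most recent commits
git log --oneline -n 2

# Number of commits that touched file1 (the -- separates options from paths)
touching_file1=$(git log --oneline -- file1 | wc -l)

# Number of commits whose message matches the pattern
modified=$(git log --oneline --grep='Modified' | wc -l)

echo "commits touching file1: $touching_file1"
echo "commits mentioning Modified: $modified"
```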

To explore the history of a repository, click on the clock icon (“View history of previous commits” button). This will open up the “Review Changes” window in the “History” tab.

Along with the reverse chronological list of commits, for each commit (and file thereof), you can explore the changes (diffs) that occurred.

Text that appears over a green background represents text that has been added as part of the current commit. Text that appears over a red background represents text that has been removed.

If we scroll down and explore the changes in dir1/file2 for the most recent commit, we see that the text * Notes was removed and then * Notes and --- were added. At first this might seem a bit odd - why was * Notes deleted and then added back?

Git works on entire lines of text. So the first line was replaced because, in the newer version, the first line ends with a newline character. Although we can’t see this character, it is there, and we see it via its effect (sending the text after it to the next line). Hence, two lines of text were actually changed in the most recent commit.

7.1.2 reflog

Another way to explore the commit history is to look at the reflog. This is a log of updates to HEAD and the branch references. This approach is more useful when we have multiple branches and so will be revisited in the section on branching. It records every change to where these references point (such as checkouts and resets), not just the commits.

git reflog
778d70d HEAD@{0}: commit: Modified file2, added .gitignore
64bffa2 HEAD@{1}: commit: Modified file1 and added file2 (in dir1)
fbd5ddb HEAD@{2}: commit (initial): Initial repo and added file1

Some of this sort of information can be gleaned from the git “History”. Just make sure you select (“all branches”) from the “Switch branch” menu.
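To see that the reflog records more than commits, the following sketch moves a branch backwards and forwards with git reset --hard (a command covered later in this tutorial) inside a throwaway repository. The file name and 'Demo User' identity are hypothetical.

```shell
# Work in a throwaway directory so no real project is touched
demo=$(mktemp -d) && cd "$demo"
git init -q
git config user.name 'Demo User'          # hypothetical identity for the demo
git config user.email 'demo@example.com'

echo 1 > f
git add f
git commit -q -m 'first'
echo 2 >> f
git commit -q -a -m 'second'

tip=$(git rev-parse HEAD)                 # remember the latest commit
git reset -q --hard HEAD~1                # step back one commit
git reset -q --hard "$tip"                # and return to where we were

git reflog                                # commits AND resets are listed

commits_in_log=$(git log --oneline | wc -l)
reflog_entries=$(git reflog | wc -l)
resets=$(git reflog | grep -c 'reset: moving to')
echo "log: $commits_in_log, reflog: $reflog_entries, resets: $resets"
```

The commit log still shows only two commits, whereas the reflog has an extra entry for each reset.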

7.1.3 diff

Whilst some of these actions described in this section are available from the “History” tab of the “Review Changes” window in Rstudio, most are only available as terminal commands.

Two of the three commits in our repository involved modifications to a file (dir1/file2). To further help illustrate commands to compare files in different states, we will additionally make a further change to dir1/file2. The git diff command allows us to explore differences between:

  • the workspace and the staging area (index)

    # lets modify dir1/file2
    echo 'Notes' >> dir1/file2
    git diff
    diff --git a/dir1/file2 b/dir1/file2
    index 143a8bb..f12af0a 100644
    --- a/dir1/file2
    +++ b/dir1/file2
    @@ -1,2 +1,3 @@
     * Notes
     ---
    +Notes

    The output indicates that we are comparing the blob representing dir1/file2 in the index (staging area) with the newly modified dir1/file2. The next couple of rows indicate that the indexed version will be represented by a ‘-’ sign and the new version will be represented by a ‘+’ sign. The next row (which is surrounded by pairs of @ signs) is the hunk header: it indicates that the hunk covers two lines (starting at line 1) of the indexed version and three lines (starting at line 1) of the new version. Finally, the remaining rows show two unchanged lines of context followed by a single added line (prefixed with ‘+’) containing the word ‘Notes’.

  • the staging area and the last commit

    git add .
    git diff --cached
    diff --git a/dir1/file2 b/dir1/file2
    index 143a8bb..f12af0a 100644
    --- a/dir1/file2
    +++ b/dir1/file2
    @@ -1,2 +1,3 @@
     * Notes
     ---
    +Notes

    Once we stage the modifications, we see that the same differences are recorded.

  • the index and a tree (in this case, the current tree)

    git diff --cached HEAD^{tree}
    diff --git a/dir1/file2 b/dir1/file2
    index 143a8bb..f12af0a 100644
    --- a/dir1/file2
    +++ b/dir1/file2
    @@ -1,2 +1,3 @@
     * Notes
     ---
    +Notes
  • the workspace and the current commit

    git diff HEAD
    diff --git a/dir1/file2 b/dir1/file2
    index 143a8bb..f12af0a 100644
    --- a/dir1/file2
    +++ b/dir1/file2
    @@ -1,2 +1,3 @@
     * Notes
     ---
    +Notes
  • two commits (e.g. previous and current commits)

    git diff HEAD^ HEAD
    diff --git a/.gitignore b/.gitignore
    new file mode 100644
    index 0000000..1944fd6
    --- /dev/null
    +++ b/.gitignore
    @@ -0,0 +1 @@
    +*.tmp
    diff --git a/dir1/file2 b/dir1/file2
    index 4fcc8f8..143a8bb 100644
    --- a/dir1/file2
    +++ b/dir1/file2
    @@ -1 +1,2 @@
     * Notes
    +---
  • two trees (first example, the current and previous commit trees)

    git diff HEAD^{tree} HEAD^^{tree}
    diff --git a/.gitignore b/.gitignore
    deleted file mode 100644
    index 1944fd6..0000000
    --- a/.gitignore
    +++ /dev/null
    @@ -1 +0,0 @@
    -*.tmp
    diff --git a/dir1/file2 b/dir1/file2
    index 143a8bb..4fcc8f8 100644
    --- a/dir1/file2
    +++ b/dir1/file2
    @@ -1,2 +1 @@
     * Notes
    ----
    git diff 07a94 2b61e
    diff --git a/dir1/file2 b/dir1/file2
    new file mode 100644
    index 0000000..4fcc8f8
    --- /dev/null
    +++ b/dir1/file2
    @@ -0,0 +1 @@
    +* Notes
    diff --git a/file1 b/file1
    index 50fcd26..28ed245 100644
    --- a/file1
    +++ b/file1
    @@ -1 +1,2 @@
     File 1
    +---------------
  • two blobs (indeed any two objects)

    git diff 50fcd 28ed2
    diff --git a/50fcd b/28ed2
    index 50fcd26..28ed245 100644
    --- a/50fcd
    +++ b/28ed2
    @@ -1 +1,2 @@
     File 1
    +---------------
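One point that the identical outputs above can obscure is that staging moves a change from one comparison to another. The sketch below (throwaway repository, hypothetical 'Demo User' identity and a hypothetical helper function called state) uses git diff --quiet, which prints nothing and reports via its exit status whether any differences exist.

```shell
# Work in a throwaway directory so no real project is touched
demo=$(mktemp -d) && cd "$demo"
git init -q
git config user.name 'Demo User'          # hypothetical identity for the demo
git config user.email 'demo@example.com'

mkdir dir1
printf '* Notes\n---\n' > dir1/file2
git add .
git commit -q -m 'Added file2'
echo 'Notes' >> dir1/file2                # an uncommitted, unstaged change

# Hypothetical helper: report whether a given git diff comparison finds changes
state() {
  if git diff --quiet "$@"; then echo same; else echo differs; fi
}

before_unstaged=$(state)                  # workspace vs index
before_staged=$(state --cached)           # index vs last commit
git add .
after_unstaged=$(state)                   # workspace vs index
after_staged=$(state --cached)            # index vs last commit
vs_head=$(state HEAD)                     # workspace vs last commit

echo "before staging: git diff=$before_unstaged, git diff --cached=$before_staged"
echo "after staging:  git diff=$after_unstaged, git diff --cached=$after_staged"
echo "git diff HEAD=$vs_head"
```

Once the change is staged, plain git diff goes quiet while git diff --cached reports it, and git diff HEAD reports it in both cases.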

7.1.4 ls-files

Similar to the previous section, the following is only really available via the terminal.

We can list the files that comprise the repository by:

git ls-files 
.gitignore
dir1/file2
file1
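Note that the listing above contains only tracked files, which is why dir1/f.tmp is absent. As a sketch (throwaway repository, hypothetical 'Demo User' identity), the ignored files can be listed by asking git ls-files for untracked files that match the standard exclude rules.

```shell
# Work in a throwaway directory so no real project is touched
demo=$(mktemp -d) && cd "$demo"
git init -q
git config user.name 'Demo User'          # hypothetical identity for the demo
git config user.email 'demo@example.com'

mkdir dir1
echo '* Notes' > dir1/file2
echo 'temp' > dir1/f.tmp
echo '*.tmp' > .gitignore
git add .
git commit -q -m 'Added file2 and .gitignore'

# Tracked files only
tracked=$(git ls-files)
# Untracked files that match an ignore rule
ignored=$(git ls-files --others --ignored --exclude-standard)

echo "tracked:"
echo "$tracked"
echo "ignored:"
echo "$ignored"
```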

The change to dir1/file2 above was just to illustrate git diff. In doing so, we now have a modified version of this file that has not been committed. Before we move on, I am going to remove these changes so that dir1/file2 is not in a modified state and reflects the state of the current commit. To do so, I will perform a hard reset (git reset --hard). More will be discussed about the git reset command later in this tutorial. For now, all that is important is to know that it restores the workspace to a previous state.

In addition to the git reset --hard, I will also clean and prune the repository.

git reset --hard 
git clean -qfdx
git reflog expire --expire-unreachable=now --all
git gc --prune=now
HEAD is now at 778d70d Modified file2, added .gitignore
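Since both git reset --hard and git clean discard work that cannot be recovered from the repository, it may be worth previewing their effect first. The following is a sketch in a throwaway repository (hypothetical file names and 'Demo User' identity): git status --short shows the uncommitted edits that a hard reset would discard, and the -n option of git clean performs a dry run.

```shell
# Work in a throwaway directory so no real project is touched
demo=$(mktemp -d) && cd "$demo"
git init -q
git config user.name 'Demo User'          # hypothetical identity for the demo
git config user.email 'demo@example.com'

echo 'kept' > file1
git add file1
git commit -q -m 'Tracked file'

echo 'changed' >> file1                   # uncommitted edit to a tracked file
echo 'temp' > scratch.tmp                 # untracked file (hypothetical name)

# Uncommitted edits that git reset --hard would throw away
pending=$(git status --short)
printf '%s\n' "$pending"

# Dry run: report what git clean -fdx would delete, but delete nothing
would_remove=$(git clean -n -d -x)
echo "$would_remove"

# The untracked file is still present after the dry run
if [ -e scratch.tmp ]; then still_there=yes; else still_there=no; fi
echo "scratch.tmp still present: $still_there"
```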